Today’s introductory activity aims to ensure that you feel comfortable working with your local copies of your GitHub repo and updating your remote (located on the website) version of that repo. The ultimate goal is to update the ID.txt document that we created last class period with your name and student ID number.
The goal: edit ID.txt on your local machine to add your name (First, Preferred, Last) and student ID, then get that edited version on your remote repo (located on the website).
Work in groups of 3-4 to determine the commands you would use to:
Throughout this course, you will be creating a course portfolio in R markdown (Rmd). This is a text-based language ideal for making reproducible reports of all your analyses in RStudio. In fact, this is what we’ve made all the Data Science Friday assignments and tutorials in! It also works very well with GitHub because it is entirely text-based, so we can track and control versions.
In order to complete your portfolio in Rmd, we must first orient ourselves in RStudio.
RStudio is a graphical user interface (GUI) for R, which is a command line-esque program. When you open RStudio, you will see several parts of the program.
On the left you have the console, which is where you will run commands. On the top-right, you have several tabs. We will use the Environment tab to see data that is loaded in R. On the bottom-right, you also have several tabs, which will come into play during this tutorial.
To change the look of RStudio, you can go to Tools -> Global Options -> Appearance and select colors, font size, etc. If you plan to be working for longer periods, I suggest choosing a dark background color scheme to save your computer battery and your eyes.
Projects in RStudio help us stay organized. When you use a project, RStudio creates a .Rproj file that links all of your files and outputs to the project folder. When you import data, R automatically looks for the file in the project folder instead of you having to specify a full file path like /Users/username/Desktop/. R also automatically saves any output to the project folder. Finally, projects allow us to save your R environment in .RData so that when we close RStudio and then re-open it, we can start right where we left off without re-importing any data or re-calculating any intermediate steps. We will see these project features as we begin to work with data in RStudio.
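The relative-path behaviour described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: a temporary folder stands in for the project folder, and the file name example_data.csv is hypothetical.

```r
# Stand-in for a project folder. When a project is opened, RStudio sets
# the working directory to the project folder; here we set it by hand.
project_dir <- file.path(tempdir(), "example_project")
dir.create(project_dir, showWarnings = FALSE)
old_wd <- setwd(project_dir)

# Hypothetical file name, used only for illustration.
write.csv(data.frame(x = 1:3, y = c(2, 4, 6)), "example_data.csv",
          row.names = FALSE)

# A relative path is enough because R looks in the working directory.
dat <- read.csv("example_data.csv")
print(dat)

# The full path that the relative path resolves to.
full_path <- file.path(getwd(), "example_data.csv")
print(file.exists(full_path))

# Restore the previous working directory.
setwd(old_wd)
```

Inside a real project you would not need the setwd() calls, since the project already points R at the right folder.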
To begin and stay organized, we will start a new R project: File -> New Project.
Create this project in your MICB425_portfolio repo (which we created last time).
Another important step to stay organized is R scripts. These are text files in which to save your R code. If you just type code into the Console (left side in RStudio) and run it, it will not be saved anywhere. Thus, when you come back to a project, you will not know what you did in the past.
Since we will be creating portfolios in R markdown, we will use an R markdown file in place of a plain R script. It saves your code and also lets you add formatting like headers, tables, etc. that is useful in your portfolio.
Let’s create a new R markdown to save your work. Go to File -> New File -> R Markdown. Input a title, your name and student ID, and select the html output format.
Now, you will see the left side split between your R markdown (top) and the Console (bottom).
Your R markdown is automatically populated with some example code. You can delete everything outside of the top text bracketed by “---”.
Note that when we save our R markdown, it is automatically saved in our project folder!
Please save your R markdown as StudentID_MICB425_portfolio.Rmd
Now that you have an .Rproj and .Rmd, be sure to add, commit, and push them!
Knitting is the process by which R takes an R markdown, evaluates all the text and code, and creates a beautiful report in PDF, html, or Word format. Since we set our output to html when we made our R markdown, when we click ‘Knit’ above the markdown, it will knit to an html. Since we are in a project, this file is created and saved in your project folder.
When you knit a document for the first time, you will be asked to install some packages. Please allow all of these installs within RStudio.
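Knitting can also be triggered from the Console, since the Knit button is a front end to rmarkdown::render(). The sketch below writes a tiny R markdown file to a temporary folder and renders it. The file name knit_example.Rmd and its contents are made up for illustration, and the render step only runs if the rmarkdown package and pandoc are available.

```r
# Three backticks, built in code so this example stays self-contained.
fence <- strrep("`", 3)

# Contents of a minimal, hypothetical R markdown file.
rmd_lines <- c(
  "---",
  'title: "Knit example"',
  "output: html_document",
  "---",
  "",
  "# Example header",
  "",
  "Some plain text.",
  "",
  paste0(fence, "{r}"),
  "mean(c(1, 2, 3))",
  fence
)

rmd_file <- file.path(tempdir(), "knit_example.Rmd")
writeLines(rmd_lines, rmd_file)

# Render only when the required tools are installed.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(html_file)  # path of the knitted html file
} else {
  message("rmarkdown or pandoc not available; skipping render")
}
```

For your portfolio, clicking Knit is all you need; this form is mainly useful if you ever want to knit from a script.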
You can get help with any function in R by inputting ?function name into the Console. This will open a window in the bottom right under the Help tab with information on that function, including input options and example code.
?mean
If you are unsure of the exact function name, you can expand the help search to include all descriptions with ??function name. For example, if you did not remember that R uses mean to calculate an average, you could use ??average. This will bring up a list of functions with ‘average’ somewhere in the name or description for you to choose from.
??average
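As a small illustration of how a help page's input options translate into code, the sketch below uses the na.rm argument documented on the help page for mean. The example values are made up.

```r
# ?mean is shorthand for help("mean"); ??average is shorthand for
# help.search("average"). Both are meant for interactive use.

# args() shows a function's arguments directly in the Console.
print(args(mean))

# The help page for mean documents the na.rm option, which drops
# missing values (NA) before averaging.
values <- c(1, 2, 3, NA)
avg_with_na <- mean(values)                  # NA, because one value is missing
avg_without_na <- mean(values, na.rm = TRUE) # average of 1, 2, 3
print(avg_with_na)
print(avg_without_na)
```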
The YAML header is the area between the “---” lines at the top of an R markdown.
This is written in YAML (YAML Ain’t Markup Language), a human-friendly data serialization standard for all programming languages. Basically, YAML allows us to easily read the header both within RStudio and after we knit into a report.
In general, we will not edit the header as RStudio automatically populates it with what we need. For example, your header should have your title, name, date, and output type as a result of the information you input when creating the R markdown.
The one thing we will change is the date so that R will automatically update the date every time you create a report.
Change the date section of your YAML header to be
date: "version `r format(Sys.time(), '%B %d, %Y')`"
When you knit, this looks like
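The R expression inside the date field can also be run on its own in the Console to see what it produces. The sketch below is only an illustration: the fixed date 2018-01-15 is an arbitrary example, and month names follow your computer's language settings.

```r
# Same format string as in the YAML header:
# %B = full month name, %d = day of the month, %Y = 4-digit year.
date_format <- "%B %d, %Y"

# What the header evaluates to today.
todays_label <- paste("version", format(Sys.time(), date_format))
print(todays_label)

# The same format applied to a fixed, arbitrary date.
fixed_label <- paste("version", format(as.Date("2018-01-15"), date_format))
print(fixed_label)
```

With English language settings, the fixed date prints as "version January 15, 2018".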
R markdown designates headers with # with each # pushing the header down one level.
# Example header 1
## Example header 2
### Example header 3
You should utilize the portfolio template provided in your MICB425_materials repo to see how module 01 headers should be organized. Please also follow a similar format for later modules in this course.
You can insert a table of contents by adding toc: yes under the output part of your header, indented beneath the output format. This will take all of your headers and organize them into a table of contents automatically.
---
title: "MICB425 portfolio"
author: "Kim Dill-McFarland (student ID)"
date: "version `r format(Sys.time(), '%B %d, %Y')`"
output:
  pdf_document:
    toc: yes
---
You can treat a markdown like any text document outside of special formatting like # for headers. Simply type away and plain text will appear in your portfolio.
You can modify this text by surrounding it with special characters like below.
*italicized text*
These characters include
* for italics
** for bold
^ for superscript
~~ for strikethrough
Ending a line of text with 2 spaces results in a line break like
line 1
line 2
while 2 hard returns (Enter) yields
line 1
line 2
Note that using 1 hard return (Enter) does not yield a line break at all.
line 1 line 2
The one thing that is not easy to format in R markdown is tables. A standard table looks like
Row names | Data 1 | Data 2
--------- | ------ | ------
Row 1 | 0 | 1
Row 2 | 1 | 0
to make
| Row names | Data 1 | Data 2 |
|---|---|---|
| Row 1 | 0 | 1 |
| Row 2 | 1 | 0 |
Please note that the | lines do not all have to line up to make a table, but it makes it much easier to read in the R markdown before knitting.
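Typing the | characters by hand is not the only route. As one possible alternative, the sketch below builds the same table as a data frame and lets knitr::kable() write the table markup. It assumes the knitr package is installed (it normally is once you have knit a document), and the exact text that kable prints may vary between knitr versions.

```r
# The example table from above, stored as a data frame.
# check.names = FALSE keeps the spaces in the column names.
tab <- data.frame(
  "Row names" = c("Row 1", "Row 2"),
  "Data 1" = c(0, 1),
  "Data 2" = c(1, 0),
  check.names = FALSE
)

# kable() turns the data frame into markdown table text.
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(tab))
} else {
  message("knitr not installed; showing the plain data frame instead")
  print(tab)
}
```

Inside a code chunk in your R markdown, calling knitr::kable(tab) places the formatted table in the knit report.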
Feel free to explore other table options through packages like
You can simply paste in the URL https://github.com/EDUCE-UBC/MICB425 or link it to a word or phrase with
[text](https://github.com/EDUCE-UBC/MICB425)
If your image is online:

If your image is on your computer:

If your image is in your project folder, you don’t need the entire file path.

All of the above will insert the image within your knit document. By default, the image will be whatever size the file is or span the entire page if it is larger. You can change the size by modifying images as below.
{width=50%}
Arguably the most powerful aspect of working in R markdown (as opposed to other markdown formats or a simple text file) is the ability to include analyses from R in-line within the document.
This code is designated with
```{r}
Code, code, code
```
When code chunks are knit, R evaluates the code and prints both the code and the output into your report. Note that within a code chunk, comments (i.e. text, notes, etc.) are designated by the #.
# Create 2 sets of numbers (numeric vectors)
list1 <- c(1, 2, 3)
list2 <- c(4, 5, 6)
# Plot list1 (x-axis) vs. list2 (y-axis)
plot(x = list1, y = list2)
These code chunks can also be customized in what they output (code only, no warnings, figure size, etc.).
When you modify a chunk, you add parameters within {r } such as
```{r title, message=TRUE, warning=FALSE, fig.width=4}
```
For a more complete list of options, see here.
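If you find yourself repeating the same parameters in every chunk, one option is to set them once as defaults. The sketch below uses knitr::opts_chunk$set() with the same parameters as the example chunk above. It assumes knitr is installed, and in a real document this code would usually sit in a chunk near the top of the R markdown.

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  # Set defaults that apply to every later chunk in the document.
  knitr::opts_chunk$set(message = TRUE, warning = FALSE, fig.width = 4)

  # Check what the defaults are now.
  print(knitr::opts_chunk$get("warning"))
  print(knitr::opts_chunk$get("fig.width"))
} else {
  message("knitr not installed; skipping")
}
```

Parameters written inside an individual {r } still override these defaults for that chunk.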
As you work in RStudio, you can check that your code works by
This document demonstrates just some of the functions in an R markdown. We will continue to learn options as your portfolio progresses in this course and you are encouraged to seek out code that works best for your portfolio.